Using Random Indexing to improve Singular Value Decomposition for Latent Semantic Analysis

Authors

  • Linus Sellberg
  • Arne Jönsson
Abstract

We present results from using Random Indexing for Latent Semantic Analysis to handle Singular Value Decomposition tractability issues. We compare Latent Semantic Analysis, Random Indexing, and Latent Semantic Analysis on Random Indexing reduced matrices. In this study we use a corpus comprising 1003 documents from the MEDLINE corpus. Our results show that Latent Semantic Analysis on Random Indexing reduced matrices provides better Precision and Recall than Random Indexing alone. Furthermore, computation time for Singular Value Decomposition on a Random Indexing reduced matrix is almost halved compared to Latent Semantic Analysis.
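
To make the reduction step concrete, the following is a minimal sketch of Random Indexing applied to a term-by-document count matrix. It is not the authors' implementation: the function names, the toy matrix, and the parameters `dim`, `nonzeros`, and `seed` are assumptions chosen for illustration, and the abstract does not state which settings were used.

```python
import random

def make_index_vectors(n_docs, dim, nonzeros, seed=0):
    """Assign each document a sparse ternary index vector.

    Each vector is stored as {position: sign}, with `nonzeros`
    entries set to +1 or -1 and all other positions implicitly 0.
    """
    rng = random.Random(seed)
    vectors = []
    for _ in range(n_docs):
        positions = rng.sample(range(dim), nonzeros)
        vectors.append({p: rng.choice((-1, 1)) for p in positions})
    return vectors

def reduce_term_doc(term_doc, dim, nonzeros=4, seed=0):
    """Project a terms x documents count matrix down to terms x dim.

    A term's context vector is the count-weighted sum of the index
    vectors of the documents in which the term occurs.
    """
    n_docs = len(term_doc[0])
    index = make_index_vectors(n_docs, dim, nonzeros, seed)
    reduced = []
    for row in term_doc:
        context = [0] * dim
        for doc, count in enumerate(row):
            if count:
                for pos, sign in index[doc].items():
                    context[pos] += sign * count
        reduced.append(context)
    return reduced

# Toy data (hypothetical): 3 terms x 6 documents.
term_doc = [
    [2, 0, 1, 0, 0, 3],
    [0, 1, 0, 0, 4, 0],
    [1, 1, 0, 2, 0, 0],
]
reduced = reduce_term_doc(term_doc, dim=4, nonzeros=2)
```

In a pipeline like the one the abstract describes, the Singular Value Decomposition would then be computed on `reduced` (terms x `dim`) rather than on the full term-by-document matrix, which is where the reported saving in computation time would come from.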


Similar resources

Latent Semantic Indexing using Multiresolution Analysis

Latent semantic indexing (LSI) is commonly used to match queries to documents in information retrieval (IR) applications. It has been shown to improve the retrieval performance, as it can deal with synonymy and polysemy problems. This paper proposes a hybrid approach which can improve result accuracy significantly. Evaluation of the approach based on using the Haar wavelet transform (HWT) as a ...


Evaluation of Background Knowledge for Latent Semantic Indexing Classification

This paper presents work that evaluates background knowledge for use in improving accuracy for text classification using Latent Semantic Indexing (LSI). LSI’s singular value decomposition process can be performed on a combination of training data and background knowledge. Intuitively, the closer the background knowledge is to the classification task, the more helpful it will be in terms of crea...


Clustering and Latent Semantic Indexing Aspects of the Singular Value Decomposition

This paper discusses clustering and latent semantic indexing (LSI) aspects of the singular value decomposition (SVD). The purpose of this paper is twofold. The first is to give an explanation on how and why the singular vectors can be used in clustering. And the second is to show that the two seemingly unrelated SVD aspects actually originate from the same source: related vertices tend to be mo...


Latent Semantic Indexing by Self-organizing Map

An important problem for the information retrieval from spoken documents is how to extract those relevant documents which are poorly decoded by the speech recognizer. In this paper we propose a stochastic index for the documents based on the Latent Semantic Analysis (LSA) of the decoded document contents. The original LSA approach uses Singular Value Decomposition to reduce the dimensionality o...


Dimensionality Reduction Techniques for Document Clustering- A Survey

Dimensionality reduction techniques are applied to get rid of inessential terms, such as redundant and noisy terms, in documents. In this paper a systematic study is conducted for seven dimensionality reduction methods such as Latent Semantic Indexing (LSI), Random Projection (RP), Principal Component Analysis (PCA) and CUR decomposition, Latent Dirichlet Allocation (LDA), Singular value decomposit...




Publication date: 2008